Internet Info 1997 December

Internet Message Format | 1997-02-19 | 2KB

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9)
    id EAA16345 for urn-ietf-out; Tue, 5 Nov 1996 04:48:36 -0500
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1])
    by services.bunyip.com (8.6.10/8.6.9) with SMTP id EAA16340
    for <urn-ietf@services.bunyip.com>; Tue, 5 Nov 1996 04:48:33 -0500
Received: from nic.cafax.se by mocha.bunyip.com with SMTP
    (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA00389
    (mail destined for urn-ietf@services.bunyip.com);
    Tue, 5 Nov 96 04:48:29 -0500
Received: from nic.cafax.se (paf@nic.cafax.se [193.12.122.42])
    by nic.cafax.se (8.8.2/8.8.2) with SMTP id KAA06251;
    Tue, 5 Nov 1996 10:46:18 +0100 (MET)
Date: Tue, 5 Nov 1996 10:46:17 +0100 (MET)
From: Patrik Faltstrom <paf@swip.net>
X-Sender: paf@nic.cafax.se
To: Martin J Duerst <mduerst@ifi.unizh.ch>
Cc: jayhawk@ds.internic.net, urn-ietf@bunyip.com
Subject: Re: [URN] %encoding for reserved UTF-8 characters (was: New syntax draft)
In-Reply-To: <"josef.ifi..324:04.10.96.10.50.51"@ifi.unizh.ch>
Message-Id: <Pine.BSI.3.91.961105104008.6238B-100000@nic.cafax.se>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: Patrik Faltstrom <paf@swip.net>
Errors-To: owner-urn-ietf@bunyip.com

On Mon, 4 Nov 1996, Martin J Duerst wrote:

> - Specify that protocols may only reserve 1-octet UTF-8 characters
>   (i.e. ASCII).

This is kind of stupid from my point of view. We do not win anything
by doing this, do we?

> - Specify that protocols have to define their own escaping mechanisms
>   for things beyond ASCII.

This is better, much better. We say that native URNs are 8-bit,
following the UTF-8 specification, and that each protocol has to come
up with its own escaping mechanism.

Because the US-ASCII characters are represented by themselves in a
UTF-8 string, and these octets cannot occur in any other UTF-8
sequence, I cannot see that we have problems with the usual
"dangerous" characters in, for example, a URL, i.e. space, percent,
plus, etc. A space octet can only occur in a UTF-8 string as a space,
representing a space, so it has to be %encoded.

Because of this, it should be quite easy :-) for each protocol to
check what happens when a UTF-8 string is passed. My experience is
that UTF-8 strings behave like ISO-8859-1 strings, as long as you
don't start inserting line breaks, counting characters, etc.

    Patrik
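
The argument above rests on one property of UTF-8: octets below 0x80 only ever stand for the ASCII characters themselves, so a protocol can %encode its reserved ASCII octets and pass every other octet through untouched. Below is a minimal Python sketch of that idea, not taken from any URN draft. The function names and the reserved set of space, percent and plus are illustrative assumptions, and the check covers only the code points that present-day Python can encode.

```python
from urllib.parse import unquote_to_bytes


def ascii_octets_only_in_ascii(max_code_point: int = 0x10FFFF) -> bool:
    """Return True if no multi-octet UTF-8 sequence contains an octet < 0x80."""
    for cp in range(0x80, max_code_point + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points cannot be encoded as UTF-8
        if any(octet < 0x80 for octet in chr(cp).encode("utf-8")):
            return False
    return True


def percent_encode_reserved(text: str, reserved: str = " %+") -> bytes:
    """%encode only the reserved ASCII octets of the UTF-8 form of `text`.

    `reserved` is a hypothetical per-protocol set; all other octets,
    including those of multi-octet characters, are copied unchanged.
    """
    reserved_octets = reserved.encode("ascii")
    out = bytearray()
    for octet in text.encode("utf-8"):
        if octet in reserved_octets:
            out += b"%%%02X" % octet  # e.g. a space octet becomes %20
        else:
            out.append(octet)
    return bytes(out)


if __name__ == "__main__":
    sample = "bl\u00e5 b\u00e4r 100%"  # non-ASCII letters plus space and percent
    encoded = percent_encode_reserved(sample)
    print(encoded)
    # Undoing the %encoding and decoding as UTF-8 restores the original text.
    print(unquote_to_bytes(encoded).decode("utf-8") == sample)
    # Counting is where octets and characters diverge.
    print(len(sample), len(sample.encode("utf-8")))
```

The last line of the demo illustrates the caveat about counting characters: the octet count of a UTF-8 string differs from its character count as soon as a non-ASCII character is present.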